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Any lossless transformation on Us spatial and Up internal modes of light can be described by an 
UsUp X UsUp unitary matrix, but there is no known procedure to effect an arbitrary UaUp x UaUp unitary 
matrix on light in Ua spatial and Up internal modes. We devise an algorithm to realize an arbitrary 
discrete unitary transformation on the combined spatial and internal degrees of freedom of light. 

Our realization uses beamsplitters and operations on internal modes to effect arbitrary linear 
transformations. The number of beamsplitters required to realize a unitary transformation is reduced 
as compared to existing realization by a factor np/2 at the cost of increasing the number of internal 
optical elements by a factor of two. Our algorithm thus enables the optical implementation of higher 
dimensional unitary transformations. 


I. INTRODUCTION 

Linear optics is important in quantum information pro¬ 
cessing. The problem of sampling the output coincidence 
distribution of a linear optical interferometer, i.e., the 
BosonSampling problem, is hard to simulate on a clas¬ 
sical computer [1]. Linear optics enables the efhcient 
simulation of quantum walks EHl]. Single-photon detec¬ 
tors and linear optics allow for efficient universal quantum 
computation 011]. 

Arbitrary linear optical transformations can be real¬ 
ized on various degrees of freedom (DoFs) of light. For 
instance, any 2x2 unitary transformations on the polar¬ 
ization DoF can be decomposed into elementary opera¬ 
tions that are implemented using quarter- and half-wave 
plates [THSj. Any unitary transformation on an arbitrary 
number of spatial modes can be realized as an arrange¬ 
ment of beamsplitters, phase shifters and mirrors [THUT^ 
and of temporal modes using nested fiber loops or disper¬ 
sion [SHin]- Finally, unitary transformations on orbital- 
angular-momentum modes of light can be realized using 
beamsplitters, phase shifters, holograms and extraction 
gates [T6|. 

Experimental implementations employ spatial modes 
of light to perform quantum walks niHini, BosonSam¬ 
pling [ 20 H 24 ] . bosonic transport simulations [25| and pho¬ 
tonic quantum gates [261 - 128] . Implementing linear opti¬ 
cal transformations on n spatial modes requires aligning 
O (n^) beamsplitters m-. this requirement poses the key 
challenge to the scalability of linear optical implementa¬ 
tion of unitary transformations. 

One approach to overcoming the challenge of realizing 
a higher number of modes is to use internal DoFs, such as 
polarization, arrival time and orbital angular momentum, 
in addition to the spatial DoF. In particular, any lossless 
transformation on Ug spatial and Up internal modes can 
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be described by an risUp x ngUp unitary transformation. 
However, there is no known method to effect an arbitrary 
rigUp X ngUp unitary transformation on the state of light 
in Ug spatial and Up internal modes. 

Here we aim to devise an efficient realization of an ar¬ 
bitrary unitary transformation using spatial and internal 
DoFs. By efficient we mean that the cost of realizing 
the transformation, as quantified by the number of re¬ 
quired spatial and internal optical elements, scales no 
faster than a polynomial in the dimension of the trans¬ 
formation. Specifically, we construct an algorithm to 
decompose an arbitrary rigUp x UgUp unitary transforma¬ 
tion into a sequence of O [nl) beamsplitters and O (n^) 
internal transformations, each of which acts only on the 
internal modes of light in one spatial mode. 

In contrast to the Reck et al. approach, which allows 
the realization of any discrete unitary transformation in 
spatial modes alone, our approach enables the realization 
into spatial and internal modes [29] . At the cost of increas¬ 
ing the required number of internal optical elements by 
a factor of two, we reduce the required number of beam¬ 
splitters by a factor of np/2 as compared to the Reck 
et al. method. Another difference between our method 
and the Reck et al. method is that our method requires 
only balanced beamsplitters, which are easier to construct 
accurately [30] . 

Reducing the required number of beamsplitters at the 
cost of increasing the number of optical elements is desir¬ 
able both in free-space and in on-chip implementations 
of linear optical transformations. Free-space implementa¬ 
tions of linear optics require beamsplitters to be stable 
with respect to each other at sub-wavelength length scales. 
On-chip beamsplitters rely on evanescent coupling, which 
requires overcoming the challenge of aligning different 
optical channels. On the other hand, operations on in¬ 
ternal elements do not require mutual stability and are 
typically easier to align and are therefore preferred over 
beamsplitters. 

Moreover, our approach is advantageous experimentally 
because of its flexibility in the choice of Up and Ug. For 
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instance, consider the realization of a 6 x 6 unitary ma¬ 
trix. The Reck et al. approach allows for a realization of 
this transformation on an interferometer with six spatial 
modes. Depending on experimental requirements, our 
approach allows for a realization of the 6x6 transforma¬ 
tions using either (i) six spatial modes (ng = 6 ,np = 1), 
(ii) three spatial and two internal modes, for instance 
polarization (ug = 3 ,np = 2), (iii) two spatial and three 
internal modes (rzg = 2 ,np = 3 ) or (iv) one spatial and 
six internal modes (ug = l,np = 6). 

Our algorithm is based on the iterative use of the cosine- 
sine decomposition (CSD). The relevant background of the 
CSD is presented in Sec. |n] We detail our decomposition 
algorithm in Sec. |III| The cost of realizing an arbitrary 
unitary matrix is presented in Sec. |IV[ We conclude with 
a discussion of our decomposition algorithm in Sec. 

II. BACKGROUND: COSINE-SINE 
DECOMPOSITION 

In this section, we present the relevant background of 
the CSD, which is the key building block of our decom¬ 
position algorithm. We describe the factorization of an 
arbitrary (m + n) x (m-l-n) unitary matrix using the CSD. 
The section concludes with the realization of a 4 x 4 uni¬ 
tary transformation on two spatial and two polarization 
modes of light as enabled by the CSD. 

The CSD factorizes an arbitrary unitary matrix as 
follows [?TH 55 ] . For each (m-l-n) x (m-l-n) unitary matrix 
Um+n, there exist unitary matrices Lm-i-n, IRm-i-n, 

such that 



FIG. 1 . Depiction of the CSD. Um+n is an (m -|- n) x (m -|- n) 
unitary matrix. The CSD factorizes Um+n into the block 
diagonal matrices represented by Lm, L'„, Rm, and a CS 
matrix S2m ([^. 


where A and D are square complex matrices of dimension 
m X m and n x n respectively, and B and C are rectangular 
with respective dimensions m x n and n x m. Each row 
of the matrix {Rm) is a left-singular (right-singular) 
vector of A, as we prove in Appendix]^ Similarly, L'^ and 
are the left- and right-singular vectors of D. Finally, 
{cos^i} is the set of singular values of A. The singular 
vectors and values of any complex matrix can be computed 
efficiently using established numerical techniques [MH 57 ] . 

Now we illustrate the realization of an arbitrary 4 x 4 
unitary matrix as a linear optical transformation on two 
spatial and two polarization modes [ 4 j. The realization is 
enabled by the CSD, which decomposes the given matrix 
C/4 according to 
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\ 



/ cos 01 


sin 01 

def 


cos dm 

sin dm 


— sin 01 


cos 01 


v 

- sin dm 

cos dm 


( 2 ) 


• ( 3 ) 


The decomposition of Um+n into Em+nj S2m and Rm-i-n 
is depicted in Fig. [T] Here and henceforth, the respective 
subscripts of the matrix symbols denote the dimension of 
the matrix. 

The matrices ILm+n, ^2m and Rm+n can be constructed 
using the singular value decomposition as follows. In order 
to perform CSD on Um+m we express it as a 2 x 2 block 
matrix 


Um+n — 


A 

B 

c 
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( 4 ) 


for m = n = 2 as depicted in Fig. [^a). By definition, C/4 
acts on the four-dimensional space 'H4, which we identify 
with the combined space 

7/4 = ni"'’ O H^i'’ (6) 

of spatial and polarization modes. Thus, the 2 x 2 matri¬ 
ces L2 and i?2 are identified with transformations acting 
on the two polarization modes of light in the first spatial 
mode. Likewise, L2 and R2 correspond to transforma¬ 
tions on polarization in the second spatial mode. Each of 
these operators L2+2+L R2 ‘-an be realized with two 
quarter-wave plates, one half-wave plate and one phase 
shifter [ 7115 ]. 

The matrix 84 in Eq. (§ is a CS matrix of the form 
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This matrix can be decomposed further according to 
84(04,02) = (^2 C) l2)(02 © ® I2), 
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FIG. 2. Realization of a 4 x 4 unitary matrix t /4 as a trans¬ 
formation on two spatial and two polarization modes of light, 
(a) The CSD factorizes Ua into the left and right matrices 
L 2 , L 2 , RI, R 2 and the CS matrix S4 ([^. (b) The left and 
right matrices are realized as combinations of quarter- and 
half-wave plates, and the CS matrix is realized using two 
balanced beamsplitters and a two wave plates. 


where 


=71 (10 ’ 

e, s: (*'“■ /.,) . (10) 

The transformation {B2 G) I2) in Eq. Q represents bal¬ 
anced beamsplitters, whereas, the transformations 02002 
can be realized using wave plates acting separately on the 
polarization of light in the two spatial mode. Figure [^b) 
depicts the optical circuit for the realization of C/4 using 
beamsplitters, phase shifters and wave-plates. 

Although the realization of arbitrary 4 x 4 transfor¬ 
mations on two spatial and two polarization modes is 
known [ 4 ], there is no known realization of an arbitrary 
UgUp X UgUp transformation on rig spatial and Up internal 
modes. In the next section, we present a decomposition 
algorithm to enable this realization. 


III. ALGORITHM TO DESIGN EFFICIENT 
REALIZATION 


Here we describe the algorithm to decompose an ar¬ 
bitrary unitary matrix into beamsplitter and internal 
transformations. Our algorithm is in two parts. First, we 


decompose the given unitary matrix into internal trans¬ 
formations and CS matrices. Next we factorize the CS 
matrices into beamsplitter and internal transformations. 
MATLAB code for the CSD and for our decomposition 
algorithm is available online | 38 ) . 

This section is structured as follows. Subsection IIII Al 
details the inputs and outputs of the decomposition algo¬ 
rithm. The step-by-step decomposition of the unitary into 
internal and CS matrices is presented in Subsection [III B 
The factorization of the CS matrices into elementary op¬ 
erations is described in Subsection IIII Cl 


A. Inputs and outputs of algorithm 

Here we present the inputs and outputs of our decompo¬ 
sition algorithm. Our algorithm receives an UgUp x UgUp 
unitary matrix as an input. The algorithm returns a 
sequence of matrices, each of which describes either a 
beamsplitter acting on two-spatial modes or an internal 
unitary operation, which acts on the internal DoF in one 
spatial modes while leaving the other modes unchanged. 
The remainder of this subsection describes the basis and 
the form of the matrices yielded by our algorithm. 

The operators returned by the algorithm act on the 
combined space 

'H = %s®'Hp, (11) 

where Hg and T-Lp are spanned 

Us = span{|si), |s2), • • ■, |sn,)}, (12) 

Tip = span{|pi), IP2), ■ • ■, \PnJ} ( 13 ) 

by the rig spatial modes and the Up internal modes re¬ 
spectively for positive integers rig and Up. Each operator 
acting on the combined state of light can be represented 
by an UgUp x UgUp matrix in the combined basis 

{\ckl) = |sfc) 0 \pi) : fc G { 1 ,... ,ng}, £ G { 1 ,.. .,np}} 

of the spatial and the internal modes. Our algorithm 
returns the matrix representations of the operators in this 
combined basis {\ckt)}- 

The matrices returned by the algorithm represent either 
internal or beamsplitter transformations. Each internal 
transformation acts on the internal state of light in a 
spatial mode but not on the light in the other spatial 
modes. In the composite basis, the internal transforma¬ 
tions acting on the fc-th spatial mode are represented 
as 

I/W = tnAk-l) ® Un, 0 tnAn.-k) (M) 

for Up X Up unitary matrix Un^. 

The algorithm also returns beamsplitter matrices, 
which mix each of the corresponding internal modes of 
light in two spatial modes. The matrix representation of 
this operator in the composite basis is given by 

= lnp(/c-l) 0 (^2 0 Inp) 0 lnp(n,-/c-l) ( 15 ) 
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for B 2 as defined in Eq. representing a balanced beam¬ 
splitter. To summarize, the algorithm returns a sequence 
of matrices, each of which is an internal transformation 


in the form of Eq. (141 or is a balanced beamsplitter 


transformation in the form of Eq. (151. 


B. Decomposition of unitry matrix into internal 
and CS matrices 

In this subsection, we present the first stage of our al¬ 
gorithm. This stage decomposes the given unitary matrix 
into matrices representing internal transformations (14) 
and CS transformations 


—lnp(/c-l) ® S2np(^l, ■ • • , On^,) 


©Ip 


(16) 


sultant 


j{ns — k—l) 5 


right unitary matrices and acting on 

the remaining Ug — j spatial modes. Figure]^ a) depicts 
this first CSD for the first iteration. 

Next the matrix L', is CS decomposed. The re- 

from this second CSD commutes with 
CS matrix S2„p yielded by the first CSD |33]. Hence, the 
operators • ,, and S2n can be swapped, following 

{TLs J -LJ'T'p P 

which we multiply by Figuregb) 

depicts this second round of CSD and of the multiplication 
of the two right matrices. 

The left unitary matrices thus obtained are repeatedly 
factorized using the CSD. The resultant right unitary 
matrices are absorbed into the initial right unitary ma¬ 
trix R'} . Thus, we are left with internal and CS 

matrices and with a unitary matrix 


which enact the CS matrix S2nj, = S2„j 

on the internal degrees of light in two spatial modes 

without affecting the light in other modes. 

The first stage comprises — 1 iterations. Of these, 
the first iteration factorizes the given UgUp x UgUp unitary 
matrix into a sequence of internal and CS matrices and 
one (rig — l)np x (rig — \)np unitary matrix. This smaller 
unitary matrix is factorized in the next iteration. Figure]^ 
depicts the first of the rig — 1 iterations that comprise the 
first stage. 

In general, the j-th iteration receives an (rig + l —j)np x 
(rig + 1 — j)np unitary matrix. This iteration decomposes 
the received unitary matrix into a sequence of internal 
and CS matrices, and a smaller (ng — j)np x (rig — j)np 
unitary matrix which is decomposed in the next iteration. 

Now we describe the j-th iteration of the decomposi¬ 
tion algorithm in detail. First, the given unitary matrix 
U(n^+i-j)np is CS decomposed by setting m = Up and 
n = {us— j)np in the CSD. This CSD yields the following 
sequence of matrices 

^{np + l-j)np = Ijnp-|-(r!,s-i)np (S2np ® ^(rip-1-j)np ) 

^ ®'np + (ns—j)np I (1'^) 

for block diagonal unitary matrices 

^np + {ns—j)np = 


®-np-|-(na-j)™p 

and orthogonal CS matrix S2np- 

In other words, the first CSD of the j-th iteration 
factorizes the received unitary transformation acting on 
rig + I—j spatial modes into (i) a 2np x 2np CS matrix S2n 
acting on the j-th and (j+l)-th spatial modes, (ii) internal 
unitary matrices and , each of which act on the 
internal degrees of the j-th spatial mode and (hi) left and 



"s-i-i 


U{n^-j)np “ n 


{ris-j-i 


(19) 


i=0 


obtained by multiplying each of the right unitary matrices. 
This completes a description of the j-th iteration of the 
algorithm. 

In summary, at the end of the j-th iteration, the al¬ 
gorithm decomposes the received Ui^np+i-j)np transfor¬ 
mation into internal and CS matrices and U(^n^_j'jnp as 
depicted in Fig. [^c). The (j + l)-th iteration of the 
algorithm receives this smaller unitary matrix 

and decomposes it into internal and CS matrices and an 
even smaller unitary matrix. The algorithm iterates over 
integral values of j ranging from 1 to rig — 1. Figure]^ 
depicts the output of the algorithm at the end of the final, 
i.e., (rig — l)-th, iteration. This completes a description 
of the first stage of the algorithm. 

At the end of the first stage, the given unitary matrix 
has been factorized into a sequence of internal (14) and 
CS matrices (§. The internal matrices can be imple¬ 
mented using optical elements if a suitable realization is 
known for the internal DoF; such realizations are known 
for polarization laiH], temporal [13] and orbital-angular- 
momentum m DoFs, n the next subsection, we present 
a realization of the CS matrix using beamsplitters acting 
on spatial modes and internal transformations. 


C. Decomposition of CS unitary matrix into 
elementary operators 

Here we show how the CS matrices can be decom¬ 
posed into a sequence of beamsplitter transformations 
and internal unitary matrices. Specifically, we construct 
a factorization of any 2np x 2np CS matrix S2np, which 
is in the form of Eq. ([^, into a sequence of two balanced 
beamsplitter matrices and two internal-transformation 
matrices. 

Our decomposition of the CS matrix relies on the fol- 
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FIG. 3. A depiction of the first iteration of our algorithm 
for the decomposition of a given unitary Unsnp into internal 
(green) and CS (brown) matrices, (a) First, the Un^up unitary 
matrix is CS decomposed into (i) a 2np x 2np CS matrix 82 ^^ 
acting on the first two spatial modes, (ii) internal unitary 
matrices Lnp and R^p\ each of which act on the internal 
degrees of the first spatial mode and (iii) left and right unitary 
matrices and acting on the remaining 

ria — 1 spatial modes, (b) The matrix is further 

CS decomposed. The resultant from the second 

decomposition commutes with CS matrix 82 ^^^^ and can thus 
be absorbed into ■ (c) The algorithm repeatedly 

decomposes the left unitary matrices. The resultant right 
unitary matrices are absorbed into the initial right unitary 
matrix. At the end of one iteration, the algorithm decomposes 
UrisTip unitary operation into CS matrices, internal unitary 
matrices and the matrix 17(n,_i)„p. The next iteration of the 
algorithm decomposes the smaller unitary matrix. 


lowing identity 

S 2 np( 01 , . . . , dnj = (62 0 KpJiOup © Olp)(Sl © 1„J, 

( 20 ) 

where B 2 © lnp represents a balanced beamsplitter © 
and 



is a transformation on the internal modes. Thus, any CS 


matrix can be realized using two balanced beamsplitters 
and two internal transformations. 

To summarize, the hrst stage of our algorithm decom¬ 
poses the given unitary matrix into internal (14) and CS 


matrices (161. The next stage factorizes the CS matrices 
returned by the first stage into internal and beamsplit¬ 
ter (15) transformations, thereby completing our decom¬ 
position algorithm. 


IV. COST ANALYSIS: NUMBER OF OPTICAL 
ELEMENTS IN REALIZATION 

Here we discuss the cost of realizing an arbitrary 
UsUp X UsUp unitary matrix using our decomposition, 
where the cost is quantified by the number of optical 
elements required to implement the matrix. Optical ele¬ 
ments required by our decomposition algorithm include 
balanced beamsplitters, phase shifters and elements act¬ 
ing on internal modes. We conclude this section with a 
specific example of decomposing a 2nx 2n transformation 
into spatial and polarization DoFs. In this case, our de¬ 
composition reduces the required number of beamsplitters 
to half with the additional requirement of wave plates as 
compared to using only spatial modes. 

Consider the decomposition of an arbitrary UsUp x n^np 
unitary transformation. Realization of this transforma¬ 
tion using the Reck et al. method requires UgUp spatial 
modes and rig rip (rig np — l)/2 biased beamsplitters [TD]. In 
comparison, our decomposition requires Usiris — 1) beam¬ 
splitters. Thus, we reduce the number of beamsplitters 
required to realize an UgUp x UgUp transformation by a 
factor of 


77 = 


nsnp{nsnp - l)/2 

rig (rig - 1) 


> nll2. 


( 22 ) 


Although our decomposition reduces the required num¬ 
ber of beamsplitters, the number of optical elements re¬ 
quired for internal transformations increases by a factor 
of 2. The Reck et al. approach requires nsnp{nsnp -|- l)/2 
phase shifters to effect an rigrip x UgUp unitary transfor¬ 
mation on spatial modes. 

Our approach relies on decomposing to beamsplit¬ 
ter and internal unitary transformations, so we count 
the number of internal optical elements required in 
our transformation. Realizing an rip x Up internal 
transformation typically requires rip internal optical ele¬ 
ments [3 [131 [16]. Our decomposition requires ri^ arbitrary 
internal transformations, which are represented by matri¬ 
ces {Lnp, Rpp, R'n^} in the output. These arbitrary 
transformations can be realized using a total of ri^rip inter¬ 
nal optical elements. Furthermore, our decomposition also 
requires ns(ns — 1) internal transformations in the form of 
Qnp ( [ 2 I] ). Each of these transformations can be realized 
using Tip optical elements for the polarization, temporal 
and orbital angular momentum modes |40j . In summary, 
our decomposition requires a total of ngnp(ngnp -|- rig — 1), 
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FIG. 4. A depiction of the output of the first stage of our decomposition algorithm (Subsection |III B I for the case of ris = 4 spatial 
modes and Up internal modes. The given 4np x iup unitary matrix is decomposed into 4^ = 16 internal matrices (green) and 
na{ns — l)/2 = 6 CS matrices (brown). As usual, the right subscript of the matrices is the dimension of the space that the 
respective operators act on. The right superscript represents the spatial mode that the operators act on. The left subscript 
specifies the index of iteration that constructed the respective matrices. 


which is an increase by a factor 


nsnp{nsnp + - 1 ) 

nsnp{nsnp + l)/2 


2 + 0(l/np) 


(23) 


over the cost of the Reck et al. approach. 

Now we consider the example of using polarization as 
the internal DoF. Specifically, we compare the cost of 
realizing an arbitrary 2n x 2n transformation using (i) the 
Reck et al. approach on only spatial modes and (ii) our 
decomposition on the spatial and polarization modes of 
light, i.e., Us = n and Up = 2. The Reck et al. decompo¬ 
sition requires 2n spatial modes, n{2n — 1) beamsplitters 
and n(2n + l) phase shifters. In comparison, our approach 
requires n(n—1) balanced beamsplitters, phase shifters 
and 3n(n — l)/2 wave plates. Thus, our decomposition 
reduces the required number of beamsplitters and phase 
shifter by a factor of 2 each at the expense of an additional 
3n(n — l)/2 wave plates. 

To summarize this section, our realization of an arbi¬ 
trary risUp X UgUp unitary matrix reduces the number of 
beamsplitters required by a factor of at least Up. This 
completes the analysis of the cost of our decomposition. 


V. CONCLUSION 

In conclusion, we devise a procedure to efficiently real¬ 
ize any given rignp x ngUp unitary transformation on 
spatial and np internal modes of light. Our realization 
uses interferometers composed of beamsplitters and opti¬ 
cal devices that act on internal modes to effect the given 
transformation. Such interferometers can be characterized 
by using existing procedures [CT |42] based on one- and 
two-photon interference on spatial and internal DoFs 031- 
03]. We thus enable the design and characterization of 
linear optics on multiple degrees of freedom. 

We overcome the problem of decomposing the given 
unitary transformation into internal transformations by 
performing the CSD iteratively. We also open the pos¬ 
sibility of using an efficient iterative CSD in problems 
where the single-shot CSD is currently used 07105] . 

By employing Up internal modes, the number of beam¬ 
splitters required to effect the transformation is reduced 


by a factor of rip/2 at the cost of increasing the num¬ 
ber of internal elements by a factor of 2. Our procedure 
facilitates the realization of higher dimensional unitary 
transformations for quantum information processing tasks 
such as linear optical quantum computation, BosonSam- 
pling and quantum walks. 
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Appendix A: Construction 

In this appendix, we present our construction of the 
CSD. Recall that our CSD procedure is a building block of 
our main decomposition algorithm, which is discussed in 
Section [Hlj Although this procedure matches the output 
of existing procedures [311 [33], our procedure emphasizes 
the key role of the singular value decomposition in the 
CSD. Furthermore, numerical implementations of our 
CSD procedure are expected to be more efficient and 
stable as compared to existing procedures because of 
the efficiency and stability of established singular-value- 
decomposition algorithms [341135]. Note that efficiency of 
numerical implementations refers to the computational 
cost of performing the decomposition and differs from the 
requirement of efficient realization, which deals with the 
number of optical elements required to experimentally 
realize the matrices. 

First, we recall that the singular value decomposition 
factorizes any m x n complex matrix M into the form 

M = Wk^V^ (Al) 

for m X m unitary matrix IF, n x n unitary matrix V 
and real non-negative diagonal matrix . The matrices 
W and V diagonalize M and M respectively. In 
other words, the rows of W and V are the eigenvectors 
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of M M'f and M. These rows are called the left- and 
right-singular vectors of M. 

Now we describe the CSD of a given {m + n) x {m + n) 
unitary matrix U. In order to perform CSD of this matrix, 
we express it as a 2 x 2 block matrix 


U = 


A 

B 

C 

D 

B, 

c, 


(A2) 


m X m,n X m,m X n and n x n respectively. From the 
unitarity of U, we have 


= 

U^U-. 



AC^ +BD^ 

II _ 

CC^ + DD^ 

A^B + C'fD 

y B^'A p D'^C 

B^B + DW 


= t^+n, (A3) 
= lm+n- (A4) 


Considering the blocks on the diagonals of 
obtain the matrix equations 


Eqs. (A31, we 


= (A5) 

= 1„. (A6) 

Equations (A51 and ( |A6[ ) imply that 

[AA\BB^] = 0, (A7) 

[CC\DD'>] = 0, (A8) 


i.e., AA^ commutes with B B^ and C commutes with 
D . Furthermore, AA^ and B B^ are normal matri¬ 
ces. Hence, AA^ and B B^ are diagonalized by the same 
matrix; or A and B have the same (up to a phase) left- 
singular vectors, denoted by the unitary matrix Lm- From 
Eq. (A8), C and D have the same left-singular vectors, 
denoted by £(, . 

Erom Eq. (A4), we have 


AU + C'1'C' = 1^, (A9) 

+ = (AlO) 


Following the same line of reasoning as the one used for 
obtaining common left-singular vectors, we observe that 
matrices A and C have the same right-singular vectors, 
say R,n , and B and D have the same right-singular vectors 

K. 

The left- and right-singular vectors of the matrices 
{A, B, C, D} can be employed to diagonalize these ma¬ 
trices according to 

A = (All) 

B = L^A^B'l (A12) 

C = KA^BL (A13) 

D = L'^A^R!l (AM) 

for diagonal complex matrices {A"^, A^, A*^, A'®}. The 
matrices consisting of the absolute values of the corre¬ 
sponding complex elements of {A"^, A^, A*^, A^} matrices 


are denoted by |A^|, |A^|, jA^^I and |A^| and comprise 
the singular values of A, B, C and D matrices respec¬ 
tively. Equations ( |A11[ ) to ( |A14[ ) can be combined into a 
single (m + n) x (m + n) matrix equation 


A 

B\ f Lm 


A® 

C 

\d) 

B'n) 


A^ 


17 = E 


.A. 


771+n^ m+n+n 


1 R„ 



(A15) 


Eactorization (A15) is similar to the CSD because Ijm+n 
and Rm+n block-diagonal unitary matrices and Am+n 
comprises diagonal blocks. In the remainder of this ap¬ 
pendix, we show that A,„+„ can be brought into the form 
of a CS matrix ([^, thereby completing the construction 
of the CSD. 

If the matrices Lm {L'^) and Rm {R'n) calculated 
from the singular value decomposition of A (D), then A^ 
(A^) is a real and non-negative diagonal matrix. The 
matrices Lm, L'^, Rm and R'^ also diagonalize the matrices 
C and D resulting in A^ and A^. Unlike A'^ and A^, 
which consist of real elements, these matrices A^ and 
A^ are complex matrices in general. In other words, the 
diagonal matrices A^ and A'^ are of the form 


A® = P|A^| 

A^ = -|A<='|P^ 


(AI6) 


where P is an m x m diagonal unitary matrix. The 
phases Pjj in Eq. (AI6) for C are complex conjugates of 


the phases for B because of the unitarity of A. 

We can remove the matrix P from A^ and A^ by 
redefining Lm and Rm as 


Lm — LmP, 
Rm — RmR- 


(AI7) 

(AI8) 


Thus, the Eq. (AI5|) can be rewritten as: 
U = ‘ 


L' 


A^ 

lA^IA 

( P^Rl 




\ 

hr 


(AI9) 


or 


R — ^m+nAm-\-n^m-\-n ■ 


(A20) 


Note that the matrix Am+n comprises only real elements. 
Furthermore, Am+n is unitary because it is a product 
Am+n = '^^m+nU^m+n- Hence, Xm+n IS an orthogonal 
matrix. 

The orthogonality of the A implies that any two rows 
and any two columns of the matrix are orthogonal. There¬ 
fore, the 2x2 block matrices 


A _ ( Ai^i A^y-i-m 

~ 1 A A 

■‘^7 + 777 , 7+777 

is also an orthogonal matrix. Any 2x2 orthogonal matrix 
is of the form 


(A21) 


/ cos 6i 
y— sin 9i 


sin 9i 
cos 9i 


(A22) 




































for 1 < i < m. 

Next we consider the case oi i > m. For the matrix A® 
all the columns with the index i > m are zero. Similarly, 
for the matrix all the rows with the index i > m are 
zero. From the unitarity of Am+n, we see that each of the 
diagonal elements in the last n — m columns and rows of 
the matrix A^ is unity. In summary, the matrix Am_|_„ is 
of the form 

^m+n = S2m ® In-m (A23) 


for S 2 m a CS matrix in the form of Eq. (|^. 


This completes our procedure for factorizing a given 
unitary matrix using the CSD. matlab code for our CSD 
procedure is available online [38] . 
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